Abstract

We discuss the recently introduced multilevel algorithm for the
steady-state solution of Markov chains. The method is based on an
aggregation principle which is well established in the literature and
features a multiplicative coarse-level correction. Recursive application
of the aggregation principle, which uses an operator-dependent coarsening,
yields a multilevel method which has been shown experimentally to give
results significantly faster than the typical methods currently in use.
When cast as a multigrid-like method, the algorithm is seen to be a
Galerkin Full Approximation Scheme with a solution-dependent prolongation
operator. Special properties of this prolongation lead to the cancellation
of the computationally intensive terms of the coarse-level equations.

1 Introduction

Markov chains describe discrete-state stochastic processes in which the
probabilities of transitions between states are a function solely of the
current state of the chain, the so-called memoryless property. Since this
property is approximately satisfied by many physical systems, Markov
chains are used widely in stochastic modeling. We will draw our examples
in this paper from the performance and reliability modeling of computer
systems. It is common to distinguish between continuous time Markov chains
(CTMCs), in which transition coefficients between states are interpreted
as exponentially distributed rates or delays, and discrete time Markov
chains (DTMCs), where they are treated as probabilities. In the latter
case, the Markov chain is described by a stochastic matrix. In the
steady-state case, CTMC problems can be converted via a simple
transformation into problems described by a DTMC. Ultimately, the Markov
chain represents a linear system of equations which is usually sparse and
often extremely large.

The goal of modeling computer systems is to derive information on
performance, measured typically as job throughput or component
utilization, and availability, defined as the proportion of time a system
is able to perform a certain function in the presence of component
failures and possibly also repairs. Various abstract modeling tools for
computer systems are in widespread use today, the most important of which
are generalized stochastic Petri nets (GSPNs) [1] and queueing networks
[6]. When the memoryless condition is satisfied, such models are
equivalent to Markov chains, and it is required to solve the Markov chain
in order to derive useful information about the abstract model.

Unfortunately, the number of states of the Markov chain (and thus the
dimension of the linear system) grows extremely quickly as the complexity
of the model is increased. There is one unknown for each state that the
model may be in - a number that is subject to a combinatorial explosion.
Thus, the Markov chains that have to be solved even for relatively coarse
computer models may have tens or hundreds of thousands of states. Apart
from their size, one further drawback of typical Markov chains is the
presence of coefficients on a wide range of scales. Consider, for example,
a reliability model of a computer, in which the rate of component failure
may be only once in every few months, whereas the rates associated with
the normal behaviour of the system are measured in kHz and MHz.

The resulting large systems of equations must be solved numerically using
an iterative scheme. Typical iterative methods in use in the computer
modelling community are the Power, Gauss-Seidel (GS), and successive
overrelaxation (SOR) algorithms. Surveys of currently used methods may be
found in [12, 8]. All of these methods have the drawback that they may
require many iterations to reach an accurate solution, particularly if the
system is large or if coefficients of strongly varying magnitude are
present. This can lead to unacceptably long computation times.

In this paper, we will consider a multilevel (ML) solution algorithm for
Markov chains, which was introduced in [5]. The method is based on the
principle of iterative aggregation and disaggregation, a well-established
numerical solution technique for Markov chains [7, 16, 14]. This principle
utilizes a coarse-level correction that is multiplicative, rather than
additive, i.e., newly obtained coarse-level values are used as a factor by
which fine-level approximations are rescaled. Furthermore, the aggregation
itself, or the coarsening strategy, is operator-dependent, in that locally
strongly coupled states are mapped together.

Algebraic multigrid (AMG) [13] is considered to be an attractive solution
strategy for systems of equations that are unstructured and which may
have strongly varying coefficients. These are characteristic properties of
the Markov chains that typically occur in practice. However, as far as we
know, until now there has been no AMG approach to solving Markov chains.

It will be shown that the multilevel method is equivalent to an algebraic
multigrid scheme which uses the Galerkin method for the coarse level
operator and is of Full Approximation Scheme (FAS) type. When viewed as a
multigrid scheme, the novelty of the multilevel method is seen to stem
from the definition of the prolongation operator, which is
solution-dependent and yields the identity operator when combined from the
left or from the right with the restriction operator. This has two
interesting effects: the right-hand side of the coarse level equations
degenerates into a simple restriction of the fine-level right-hand side,
and the coarse level operator is solution-dependent and therefore changes
from iteration to iteration, even though the Markov chain problem itself
is linear.

In the following section, we describe the problem and the aggregation
equations. The multilevel method is described in section 3. In section 4,
the multilevel method is rewritten and interpreted as a multigrid scheme.
In section 5, experimental results for Markov chains arising from a
well-known multiprocessor reliability model and from a simple queueing
network are presented, showing the superiority of the method over the
standard Gauss-Seidel scheme. The final section summarizes the paper.

2 Problem Description and Aggregation Equations

Consider an irreducible Markov chain consisting of $n$ states
$s_1, s_2, \ldots, s_n$. Denote the unknown vector by $u$, where $u_i$ is
the steady-state probability of the Markov chain being in state $s_i$. In
order to facilitate the notation for the multilevel scheme, we use indices
$l$, $l-1$, etc., to indicate levels of aggregation of the Markov chain.
The original Markov chain is designated to be at level $l = l_{max}$.

We then have to solve the system of equations defined by the Markov
chain

$$A^l u^l = 0, \tag{1}$$

with the additional condition

$$\sum_{i=1}^{n} u^l_i = 1,$$

where $l = l_{max}$. Note that, in the discrete time case, equation (1) is
usually written as

$$\pi P = \pi, \tag{2}$$

where $\pi = (u^l)^T$ and $P = (A^l)^T + I$, which is the so-called
transition matrix. Equation (2) defines the solution of the Markov chain
in terms of an eigenvalue problem, and since $P$ is a stochastic matrix,
we know that it possesses the unique maximal eigenvalue $\lambda = 1$. We
will, however, use the notation of equation (1) throughout, reserving the
symbol $P$ for the prolongation operator. In the continuous time case,
equation (1) is usually written

$$\pi Q = 0,$$

where $Q = (A^l)^T$. Matrix $A^l$ has zero column sums and all
off-diagonal coefficients are non-negative, making it a singular M-matrix
of rank $n - 1$.
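
To make the relation between equation (1) and its discrete-time form (2) concrete, the following minimal sketch builds $A = Q^T$ from a small generator and finds the steady state with a simple power-type iteration. The 3-state generator `Q`, the step size `h` and the helper names are illustrative assumptions, not data from this paper.

```python
# Hypothetical 3-state CTMC generator Q (every row sums to zero).
# The numerical values are illustrative only.
Q = [[-3.0, 2.0, 1.0],
     [1.0, -1.0, 0.0],
     [2.0, 2.0, -4.0]]


def transpose(M):
    return [list(col) for col in zip(*M)]


def column_sums(M):
    return [sum(col) for col in zip(*M)]


def power_iteration(A, steps=2000):
    """Iterate u <- (I + h A) u with renormalisation.

    For h small enough, I + h A is a column-stochastic matrix, so this is
    the power method applied to the discrete-time form of the problem.
    """
    n = len(A)
    h = 1.0 / (2.0 * max(abs(A[i][i]) for i in range(n)))
    u = [1.0 / n] * n
    for _ in range(steps):
        Au = [sum(A[i][j] * u[j] for j in range(n)) for i in range(n)]
        u = [u[i] + h * Au[i] for i in range(n)]
        total = sum(u)
        u = [x / total for x in u]
    return u


A = transpose(Q)            # A = Q^T, so that A u = 0 is equation (1)
u = power_iteration(A)
residual = [sum(A[i][j] * u[j] for j in range(len(u))) for i in range(len(u))]
print(u, max(abs(r) for r in residual))
```

The factor 1/2 in the step size keeps the diagonal of $I + hA$ strictly positive, which avoids periodicity in this sketch; other choices are possible.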




A coarser representation of the Markov chain described by matrix $A^l$ may
be obtained by aggregation. This amounts to creating a new Markov chain
described by a matrix $A^{l-1}$ with the vector of state probabilities
$u^{l-1}$, each of whose $N$ states $S_1, S_2, \ldots, S_N$ is derived by
lumping together a number of states of the original system. Aggregation is
motivated by probabilistic arguments which are illustrated in figure 1.
The figure shows four states $s_1, s_2, s_3, s_4$ of a Markov chain, where
the probability of the chain being in each of the states is given by
$u_1, u_2, u_3, u_4$, respectively. In addition, a transition from $s_1$
to $s_3$ with the probability $A_{31}$ is assumed. The probability of the
chain being in either state $s_1$ or state $s_2$ is then given by
$u_1 + u_2$, and we may replace this pair of states by the corresponding
"macro-state" $S_1$, and similarly for states $s_3$ and $s_4$. The
transition $A_{31}$ is then mapped to a transition between macro-states
$S_1$ and $S_2$ with a probability value of $A_{31} u_1 / (u_1 + u_2)$,
which represents the original transition probability multiplied by the
relative probability of the Markov chain being in $s_1$, given that it is
in the macro-state $S_1$. This argument allows us to generate a coarse
level (aggregated) system from our original Markov chain.

In the following, we will use the terms fine level and coarse level to
refer to Markov chains, where the latter is obtained by aggregation from
the former. The relation $s_i \in S_j$ signifies that the fine level state
$s_i$ is mapped by the aggregation operation to the coarse level state
$S_j$. A set of fine states mapped to a common coarse level state,
$\{s_i : s_i \in S_j\}$, will be referred to as an aggregate.

The matrix $A^{l-1}$ of the aggregated system is thus defined as follows:

$$A^{l-1}_{IJ} = \frac{\sum_{s_j \in S_J} u^l_j \left( \sum_{s_i \in S_I} A^l_{ij} \right)}{\sum_{s_j \in S_J} u^l_j} \tag{3}$$

This is the classical aggregation matrix. Note that the matrix $A^{l-1}$
is a function not only of the fine level matrix $A^l$, but also of the
fine level solution vector $u^l$. It will be shown in section 4 that this
coarse level matrix is equivalent to the Galerkin operator in the
multigrid context, with special intergrid transfer operators.
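
As an illustration of equation (3), the sketch below assembles the aggregated matrix for a four-state chain in the spirit of figure 1. The function name `aggregate_matrix`, the rates in `A` and the vector `u` are assumptions made for this example; only the entry $A_{31}$ and the pairing of states follow the text above.

```python
def aggregate_matrix(A, u, agg, N):
    """Coarse matrix of equation (3).

    A   : fine-level matrix as a list of rows; A[i][j] belongs to the
          transition from state j to state i (zero column sums)
    u   : fine-level vector (exact solution or current iterate)
    agg : agg[i] is the index of the aggregate that contains state i
    N   : number of aggregates
    """
    n = len(A)
    weight = [0.0] * N                  # sum of u over each aggregate
    for j in range(n):
        weight[agg[j]] += u[j]
    Ac = [[0.0] * N for _ in range(N)]
    for i in range(n):
        for j in range(n):
            Ac[agg[i]][agg[j]] += A[i][j] * u[j]
    for I in range(N):
        for J in range(N):
            Ac[I][J] /= weight[J]
    return Ac


# Hypothetical four-state chain; A[2][0] = 0.5 plays the role of A_31.
A = [[-1.5, 2.0, 0.0, 0.0],
     [1.0, -2.0, 0.0, 0.25],
     [0.5, 0.0, -1.0, 3.0],
     [0.0, 0.0, 1.0, -3.25]]
u = [0.1, 0.3, 0.2, 0.4]                # illustrative probabilities
agg = [0, 0, 1, 1]                      # S_1 = {s_1, s_2}, S_2 = {s_3, s_4}

Ac = aggregate_matrix(A, u, agg, 2)
print(Ac[1][0])   # equals A_31 * u_1 / (u_1 + u_2) for this chain
```

Since $s_1 \to s_3$ is the only transition from $S_1$ to $S_2$ in this example, the coarse entry reduces to the expression $A_{31} u_1 / (u_1 + u_2)$ derived in the probabilistic argument.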

It is well-known that this aggregation strategy propagates the Markov
chain property to the coarser level [15], i.e., the matrix $A^{l-1}$ also
describes an irreducible Markov chain. This yields the aggregated equation
in the unknown $u^{l-1}$:

$$A^{l-1} u^{l-1} = 0, \qquad \sum_{I=1}^{N} u^{l-1}_I = 1 .$$

It can then be shown that the solution of the coarse system satisfies

$$u^{l-1}_I = \sum_{s_i \in S_I} u^l_i ,$$

i.e., the probability of being in a coarse level state is the sum of the
probabilities of being in any of its constituent fine-level states. We
will use the aggregation equation as a basis for the multilevel method,
whereby we approximate the exact solution values $u^l$ in (3) by values
from the current iterate.

The coarse-level matrix depends on the fine-level solution, and must
therefore, in the context of an iterative method, be approximated by using
the values of the current iterate. More precisely, $A^{l-1}$ is a function
of the relative values of the fine-level nodes with respect to the values
of their aggregates. These are the probabilities of the Markov chain being
in a fine-level state conditioned on being in the aggregate state.



The motivation for the multilevel method lies in the observation that, if
it is possible to obtain improved values for the relative probabilities of
fine-level states in a common aggregate, then an improved coarse-level
system can be set up and solved, the solution of which represents the
probabilities of the aggregates. The argument can, of course, be applied
recursively. We choose to form small aggregates composed of
strongly-coupled neighboring states, as the Gauss-Seidel iteration is able
to achieve an improvement in the relative probabilities of such states
efficiently. We refer to the Gauss-Seidel method in this context as a
"smoother", although its role here is somewhat different. This motivation
parallels that of multigrid: high-frequency errors are smoothed out on the
fine level, whereas low-frequency errors are reduced by the coarse-level
correction.

The solution values obtained on the coarser level are used to rescale the
values within each aggregate by the same factor. This rescaling guarantees
that the fine-level solution remains a probability vector, i.e., the
coarse level correction produces new fine-level values in the range
[0, 1].

The coarse level Markov chain thus derived forms the basis of the
well-known iterative aggregation-disaggregation algorithms [14] in which
an aggregate-wise block Gauss-Seidel or block Jacobi iteration on the fine
level alternates with a coarse level correction. These methods bear a
strong resemblance to domain decomposition methods for partial
differential equations.

In contrast to these "two-level" schemes, we will develop a multilevel
solution method by recursive application of aggregation where the problem
defined at each level represents a Markov chain. The converged solution
value at each coarse level state is the sum of the converged solution
values of its constituent fine level states. It is to be hoped that
similar improvements in performance over single-level iterations such as
Gauss-Seidel can be obtained as is the case for multigrid and elliptic
partial differential equations. The improvement will come from choosing
small aggregates, for which it will be sufficient merely to improve the
relative probabilities using a few Gauss-Seidel sweeps, rather than to use
large aggregates consisting of many unknowns, and to solve for these
values, which can be extremely expensive. In addition, the recursive
application of the aggregation will allow us to capture relative
probabilities at all scales.

3 Multilevel Solution Algorithm

In this section, we recall the recently introduced multilevel algorithm
[5], which is based on a recursive aggregation of the Markov chain to
obtain approximations of successively smaller dimensions. The algorithm
passes through all levels of the hierarchy of chains in a multigrid-style
V-cycle. The coarser level equations are the aggregation equations of
section 2.

We adopt the following abbreviations for elementwise multiplication and
division on vectors in $\mathbb{R}^m$:

$$a = b * c \iff a_i = b_i \, c_i, \quad 1 \le i \le m$$

$$a = b / c \iff a_i = b_i / c_i, \quad 1 \le i \le m$$

In the following, $\tilde u$ represents an intermediate vector, $u$ an
approximation to the solution vector, and $u^*$ a correction vector. We
denote by $(u^l)^{(i)}$ the $i$-th iterate. One iteration of the two-level
version of the ML algorithm, using one relaxation sweep on the fine level,
is given by the following sequence of steps.


1. Perform one Gauss-Seidel relaxation sweep on the fine level, which
we denote by

$$\tilde u^l = GS\left((u^l)^{(i)}\right)$$

2. Restrict the current approximation to the solution to the coarse level,
where the restriction operator $R$ is defined as summation of the values
of fine-level states mapped to a common coarse state:

$$\tilde u^{l-1} = R(\tilde u^l) \iff \tilde u^{l-1}_J = \sum_{s_i \in S_J} \tilde u^l_i .$$

$R$ can be represented by an $N \times n$ matrix of zeros and ones.

3. Compute the coarse level matrix $\tilde A^{l-1}$ as an approximation to
that of equation (3) using the current values of the solution vector:

$$\tilde A^{l-1}_{IJ} = \frac{\sum_{s_j \in S_J} \tilde u^l_j \left( \sum_{s_i \in S_I} A^l_{ij} \right)}{\sum_{s_j \in S_J} \tilde u^l_j} \tag{4}$$

4. Solve the coarse Markov chain problem for $u^{l-1}$:

$$\tilde A^{l-1} u^{l-1} = 0, \qquad \sum_{J=1}^{N} u^{l-1}_J = 1 . \tag{5}$$

5. Compute the coarse-level correction as the ratio of new coarse level
solution and restricted fine-level solution. In this step, we compute the
factor by which the probability of each aggregate must be corrected:

$$(u^{l-1})^* = u^{l-1} / \tilde u^{l-1}$$

6. Compute the fine-level correction as a prolongation $P$ of the
coarse-level correction vector. All fine level states of an aggregate are
corrected by the same factor:

$$(u^l)^* = P\left((u^{l-1})^*\right) \iff (u^l)^*_i = (u^{l-1})^*_J, \quad s_i \in S_J .$$

$P$ can be represented by an $n \times N$ matrix of zeros and ones.

7. Apply the fine-level correction, i.e., rescale the aggregate
probabilities to obtain the new $(i+1)$-th iterate:

$$(u^l)^{(i+1)} = (u^l)^* * \tilde u^l \tag{6}$$


In this two-level form, the method is similar to the iterative
aggregation-disaggregation (IAD) method of Koury, McAllister and Stewart
[7], except that a pointwise, rather than a block Gauss-Seidel, approach
is used. The multilevel algorithm is obtained by recursive application of
the two-level algorithm to obtain a solution to the aggregated equation
(5). It is described in algorithmic form in figure 2. The coarse level
$l-1$ and fine level $l$, between which the operators $R$ and $P$ map, are
identified by appropriate indices. We allow in general the possibility of
applying GS $\nu_1$ times at each level as a pre-smoothing step and
$\nu_2$ times as a post-smoothing step.
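
The seven steps of the two-level iteration can be sketched in a few lines. This is a minimal dense-matrix illustration under stated assumptions, not the implementation used for the experiments: the function names, the stand-in coarse solver (repeated Gauss-Seidel sweeps) and the four-state birth-death test chain are all hypothetical choices.

```python
def gs_sweep(A, u):
    """One Gauss-Seidel sweep for A u = 0, followed by normalisation."""
    n = len(A)
    u = list(u)
    for i in range(n):
        s = sum(A[i][j] * u[j] for j in range(n) if j != i)
        u[i] = -s / A[i][i]
    total = sum(u)
    return [x / total for x in u]


def coarse_matrix(A, u, agg, N):
    """Coarse matrix built from the current iterate u (step 3)."""
    n = len(A)
    w = [0.0] * N
    for j in range(n):
        w[agg[j]] += u[j]
    Ac = [[0.0] * N for _ in range(N)]
    for i in range(n):
        for j in range(n):
            Ac[agg[i]][agg[j]] += A[i][j] * u[j] / w[agg[j]]
    return Ac


def solve_coarse(Ac, sweeps=200):
    """Stand-in coarse solver (assumption): many Gauss-Seidel sweeps."""
    N = len(Ac)
    v = [1.0 / N] * N
    for _ in range(sweeps):
        v = gs_sweep(Ac, v)
    return v


def two_level_step(A, u, agg, N):
    ut = gs_sweep(A, u)                          # step 1: smoothing
    utc = [0.0] * N                              # step 2: restriction
    for i, I in enumerate(agg):
        utc[I] += ut[i]
    Ac = coarse_matrix(A, ut, agg, N)            # step 3: coarse matrix
    uc = solve_coarse(Ac)                        # step 4: coarse solve
    corr = [uc[I] / utc[I] for I in range(N)]    # step 5: coarse correction
    # steps 6 and 7: prolong the correction and rescale the fine values
    return [corr[agg[i]] * ut[i] for i in range(len(ut))]


# Hypothetical birth-death chain: rate 1 upwards, rate 2 downwards.
A = [[-1.0, 2.0, 0.0, 0.0],
     [1.0, -3.0, 2.0, 0.0],
     [0.0, 1.0, -3.0, 2.0],
     [0.0, 0.0, 1.0, -2.0]]
agg = [0, 0, 1, 1]

u = [0.25] * 4
for _ in range(10):
    u = two_level_step(A, u, agg, 2)
print(u)
```

Replacing `solve_coarse` by a recursive call to the same step on the coarse chain would give the multilevel variant; that recursion is omitted here to keep the sketch short.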

Figure 2: Multilevel Algorithm

The multilevel method is identical in structure and similar in motivation
to a standard multigrid V-cycle. The principal difference resides in the
derivation of the fine-level correction from the coarse level solution
vector. Whereas here the correction is a multiplication by the prolonged
ratio of new-to-old coarse level solutions, in standard multigrid it is
the addition of the prolonged difference between these two vectors.
However, in the following section, by absorbing the current approximation
vector into the prolongation operator, we can interpret the ML method as a
multigrid scheme.

As is the case for algebraic multigrid [13], we use an aggregation
strategy that maps strongly-coupled fine-level states to a common coarse
level state. In general, aggregation is pairwise, but aggregates
consisting of three or four states, or even as few as one state, are
permitted if the coefficients so demand. When the Markov chain contains
strongly differing rates - which is somewhat analogous to the presence of
strong local convection or anisotropy in PDEs - the convergence rate of
the ML scheme is sensitive to the aggregation strategy. To determine the
aggregates, we use a greedy algorithm whose complexity is linear in the
number of edges of the Markov graph.
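
The details of the greedy aggregation are not spelled out here, so the following is only a hypothetical sketch of one strength-based pairwise strategy. It sorts the edges by coupling strength, which costs more than linear time, and it never forms aggregates of three or four states; both are simplifications of this example and not properties of the algorithm described above.

```python
def greedy_pairwise_aggregates(A):
    """Pair each state with its strongest still-unassigned neighbour.

    A is a dense list-of-rows matrix; only off-diagonal entries are used.
    Returns (agg, count), where agg[i] is the aggregate index of state i.
    """
    n = len(A)
    # Strength of the undirected pair {i, j}: the larger of the two rates.
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            strength = max(abs(A[i][j]), abs(A[j][i]))
            if strength > 0.0:
                edges.append((strength, i, j))
    edges.sort(key=lambda e: (-e[0], e[1], e[2]))   # strongest first
    agg = [None] * n
    count = 0
    for _, i, j in edges:
        if agg[i] is None and agg[j] is None:
            agg[i] = agg[j] = count
            count += 1
    for i in range(n):          # unmatched states become singletons
        if agg[i] is None:
            agg[i] = count
            count += 1
    return agg, count


# Hypothetical chain: strong couplings 0-1 and 2-3, weak coupling 1-2.
A = [[-100.0, 100.0, 0.0, 0.0],
     [100.0, -100.01, 0.01, 0.0],
     [0.0, 0.01, -50.01, 50.0],
     [0.0, 0.0, 50.0, -50.0]]
agg, count = greedy_pairwise_aggregates(A)
print(agg, count)
```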

4 Interpretation as a multigrid method 

The multilevel method is based on the iterative aggregation-disaggregation
strategy, which dates at least from 1975 [16] and whose equations are
derived in a natural way by probability arguments. In this section, we
will show that the scheme can be written as a classical algebraic
multigrid algorithm and point out the particular choices of multigrid
components that the ML scheme represents.

We begin by considering the multiplicative coarse level correction as the
composition of steps 5, 6 and 7 in the algorithm of section 3. This
correction is defined by

$$u^l_i = \tilde u^l_i \, \frac{u^{l-1}_J}{\tilde u^{l-1}_J}, \quad s_i \in S_J , \tag{7}$$

where $u^l_i$ denotes the new fine-level value and the new coarse level
solution is used to scale the fine-level solution values in each aggregate
by the same factor. The coarse grid correction in multigrid is, however,
generally written as an additive correction, so we rewrite equation (7) as

$$u^l_i = \tilde u^l_i + \frac{\tilde u^l_i}{\tilde u^{l-1}_J} \left( u^{l-1}_J - \tilde u^{l-1}_J \right), \quad s_i \in S_J .$$

Thus, the scaling of the fine solution by the prolongation of the ratio of
new and old coarse level solutions is equivalent to an additive correction
using the prolonged difference between the two coarse vectors scaled by a
solution-dependent factor $\tilde u^l_i / \tilde u^{l-1}_J$. This strategy
ensures that the correction step will automatically produce new fine-level
solution values that remain bounded in the interval [0, 1]. In addition,
we observe that the relative probabilities of the states within each
aggregate are unaffected by the coarse level correction.
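
A quick numerical check of this equivalence, with made-up values for the smoothed fine-level vector and the new coarse solution (both are assumptions of this sketch):

```python
ut = [0.10, 0.30, 0.20, 0.40]     # smoothed fine-level values (illustrative)
agg = [0, 0, 1, 1]                # aggregate index of each fine state
uc = [0.25, 0.75]                 # hypothetical new coarse solution

# Restricted fine-level values: sums over each aggregate.
utc = [0.0, 0.0]
for i, J in enumerate(agg):
    utc[J] += ut[i]

# Multiplicative form of the correction.
mult = [ut[i] * uc[agg[i]] / utc[agg[i]] for i in range(4)]

# Additive form with the solution-dependent scaling factor.
add = [ut[i] + ut[i] / utc[agg[i]] * (uc[agg[i]] - utc[agg[i]])
       for i in range(4)]

print(mult)
print(add)
```

Both forms give the same vector, its entries sum to one, and the ratio of the two values inside each aggregate is the same before and after the correction.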

Observation 1 We may write the ML method with an additive, rather than
multiplicative, correction, using the prolongation operator $P$ given as

$$P = D \bar P ,$$

where $\bar P$ is the standard multigrid prolongation

$$\bar P = R^T$$

and $D$ is defined by

$$D = \mathrm{diag}\left( \tilde u^l_i / \tilde u^{l-1}_J \right), \quad s_i \in S_J .$$

Thus, the prolongation operator is equivalent to the standard multigrid
prolongation operator multiplied by the solution-dependent diagonal matrix
$D$. Strictly speaking, we should therefore represent the dependency of
$P$ on $\tilde u^l$ by writing $P_{\tilde u^l}$, but we generally omit the
suffix in the interest of simplicity. We now make some observations on the
thus-defined multilevel algorithm.


Observation 2 The prolongation and restriction operators satisfy the
following conditions:

$$P \neq R^T , \tag{8}$$

$$P R \tilde u^l = I^l \tilde u^l , \tag{9}$$

$$R P v^{l-1} = I^{l-1} v^{l-1} , \tag{10}$$

where $I^l$, $I^{l-1}$ are the identity operators on levels $l$ and
$l-1$, respectively.

Proof of (9):

$$(P R \tilde u^l)_i = \frac{\tilde u^l_i}{\sum_{s_j \in S_J} \tilde u^l_j} \sum_{s_j \in S_J} \tilde u^l_j = \tilde u^l_i, \quad s_i \in S_J . \qquad \square$$

Properties (8) - (10) are in contrast to the usual case in multigrid.
Property (9) has, perhaps, the most interesting consequence, as
Observation 5 below will show.
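
Properties (9) and (10) are easy to check numerically. The sketch below assumes the prolongation has the entries $P_{iJ} = \tilde u_i / \sum_{s_j \in S_J} \tilde u_j$ for $s_i \in S_J$, which is our reading of Observation 1; the vectors and helper names are illustrative.

```python
def restriction(agg, N):
    """R as an N x n matrix of zeros and ones."""
    n = len(agg)
    return [[1.0 if agg[i] == I else 0.0 for i in range(n)] for I in range(N)]


def prolongation(ut, agg, N):
    """Solution-dependent P as an n x N matrix built from the vector ut."""
    w = [0.0] * N
    for i, I in enumerate(agg):
        w[I] += ut[i]
    return [[ut[i] / w[J] if agg[i] == J else 0.0 for J in range(N)]
            for i in range(len(agg))]


def matvec(M, x):
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


ut = [0.10, 0.30, 0.20, 0.40]          # vector that defines P (illustrative)
agg = [0, 0, 1, 1]
R = restriction(agg, 2)
P = prolongation(ut, agg, 2)

PRu = matvec(P, matvec(R, ut))         # property (9): reproduces ut
RP = matmul(R, P)                      # property (10): 2 x 2 identity
w = [0.25, 0.25, 0.25, 0.25]
PRw = matvec(P, matvec(R, w))          # PR is not the identity in general
print(PRu, RP, PRw)
```

The last line illustrates that $PR$ acts as the identity only on the vector from which $P$ was built, whereas $RP$ is the identity on the whole coarse space.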


Observation 3 The coarse level system defined by $A^{l-1}$ from (3) is
equivalent to the Galerkin approximation to $A^l$ defined by

$$A^{l-1} = R A^l P . \tag{11}$$

Proof: We may compute the $(I,J)$ coefficient of the matrix $R A^l P$ as
the $I$-th element of the vector $R A^l P e^{l-1}$, where
$e^{l-1} = (0, \ldots, 0, 1, 0, \ldots, 0)^T$ and the 1 is in the $J$-th
position. This gives

$$(R A^l P e^{l-1})_I = \sum_{s_i \in S_I} \sum_{s_j \in S_J} A^l_{ij} \, \frac{u^l_j}{\sum_{s_k \in S_J} u^l_k} ,$$

which is equivalent to (3).
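
The identity can also be confirmed numerically on a small example. The four-state matrix `A`, the vector `u` and the aggregate map below are hypothetical; the prolongation is built as $P_{iJ} = u_i / \sum_{s_k \in S_J} u_k$ for $s_i \in S_J$.

```python
# Hypothetical four-state chain with zero column sums.
A = [[-1.5, 2.0, 0.0, 0.0],
     [1.0, -2.0, 0.0, 0.25],
     [0.5, 0.0, -1.0, 3.0],
     [0.0, 0.0, 1.0, -3.25]]
u = [0.1, 0.3, 0.2, 0.4]
agg = [0, 0, 1, 1]
n, N = 4, 2

w = [sum(u[j] for j in range(n) if agg[j] == J) for J in range(N)]
R = [[1.0 if agg[i] == I else 0.0 for i in range(n)] for I in range(N)]
P = [[u[i] / w[J] if agg[i] == J else 0.0 for J in range(N)] for i in range(n)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


# Galerkin product R A P.
galerkin = matmul(matmul(R, A), P)

# Classical aggregation formula, evaluated entry by entry.
formula = [[sum(A[i][j] * u[j]
                for i in range(n) if agg[i] == I
                for j in range(n) if agg[j] == J) / w[J]
            for J in range(N)] for I in range(N)]

print(galerkin)
print(formula)
```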

From Observations 1 and 3, we draw the following:

Observation 4 The ML method is equivalent to a certain FAS Galerkin
multigrid iteration.

Some of the above relationships have been noted by previous authors;
Haviv [4] discusses various iterative aggregation schemes, pointing out
(9) and (11), and Krieger [8] points out the relationship between IAD
schemes and two-level multigrid algorithms.

If we consider applying the equivalent multigrid algorithm to a
non-singular problem with a non-trivial right hand side, such as a
discretized Poisson equation, we obtain the following FAS-Galerkin coarse
level equation:

$$R A^l P (u^{l-1}) = R f^l - R A^l \tilde u^l + R A^l P R \tilde u^l , \tag{12}$$

for which we may note the following.

Observation 5 Property (9) of the ML prolongation operator leads to
cancellation of the second and third terms in the right hand side of (12),
yielding the vector $R f^l$ which is constant and may be precomputed:

$$R A^l P (u^{l-1}) = R f^l .$$

In traditional multigrid methods the coarse system is driven by the
changing right hand side (the restricted fine-level defect). For linear
problems, the coarse matrix is constant and may be precomputed. In the
present method, the situation is reversed: the forcing function is
constant and successive coarse solutions are driven by a changing system
matrix. The computational saving of the latter scheme in the evaluation of
the right hand side is substantial: one fine and one coarse matrix-vector
product per iteration.
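
For completeness, the cancellation can be written out in one line. Writing $\tilde u^l$ for the current fine-level approximation from which $P$ is built (our notation for this sketch), the identity $P R \tilde u^l = \tilde u^l$ gives

$$- R A^l \tilde u^l + R A^l P R \tilde u^l = - R A^l \tilde u^l + R A^l \left( P R \tilde u^l \right) = - R A^l \tilde u^l + R A^l \tilde u^l = 0 ,$$

so that only the restricted fine-level right hand side remains on the right of the coarse equation. The two terms that drop out are precisely the fine-level product with $A^l$ and the coarse-level product with $R A^l P$ referred to above.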

The convergence properties of the ML method will depend on the quality
of the approximation of the coarse solution. Since the coarse equations
are solution-dependent and vary from iteration to iteration, it is
worthwhile to ask how well the coarse matrix (4) approximates the
converged matrix (3) and, therefore, how close the computed coarse
solution $\hat u^{l-1}$ can be to the converged solution $u^{l-1}$.

Definition 1 We define an approximation $\tilde u^l$ to the exact solution
$u^l$ to be smooth, if the relative magnitudes of all solution values
within each aggregate are in error by no more than $O(\epsilon)$:

$$\left| \frac{\tilde u^l_i}{\sum_{s_j \in S_I} \tilde u^l_j} - \frac{u^l_i}{\sum_{s_j \in S_I} u^l_j} \right| \le O(\epsilon), \quad s_i \in S_I . \tag{13}$$

Since aggregates are composed of neighboring fine-level states, this
definition of smoothness is consistent with that of algebraic smoothness
in the context of algebraic multigrid [13]. The smoothing property
(reduction of high-frequency algebraic errors) is equivalent to achieving
approximately correct relative magnitudes of neighboring solution values.
Recall that the quality of the coarse level matrix (4) depends only on the
relative, rather than the absolute, sizes of the solution values in each
aggregate.

Using the eigenvalue problem formulation of equation (1), where
$B = A^{l-1} + I$, we write the coarse problem

$$B u^{l-1} = u^{l-1} . \tag{14}$$

At any iteration of the ML method until convergence is reached, the coarse
matrix will differ from its converged value. We write the matrix computed
during the ML iteration from $\tilde u^l$ as a perturbation
$B + \Delta = \tilde A^{l-1} + I$ of the converged value $B$. The
approximate coarse system at any iteration is therefore

$$(B + \Delta) \hat u^{l-1} = \hat u^{l-1} . \tag{15}$$

Note that $\Delta$ has the same sparsity pattern as $B$.

Observation 6 If (13) is satisfied, then the error matrix $\Delta$
satisfies $\Delta = O(\epsilon)$. Equations (15) and (14) yield

$$\hat u^{l-1} = u^{l-1} + O(\epsilon) . \tag{16}$$

The coarse solution is therefore also in error by only $O(\epsilon)$. The
exact form of the error term in equation (16) can be found in [9]. We
conclude that an algebraically smooth fine-level approximation will yield
a coarse system whose solution is an acceptable approximation to that of
the converged state.

Figure 3: Simple Multiprocessor Example


5 Experimental Results 

In computer modelling, Markov chains are seldom developed explicitly;
their size alone would in general make this an impractical task. Instead,
more abstract modelling paradigms are used, the most important of which
are queueing networks and stochastic Petri-nets. In order to demonstrate
the gain in efficiency of the ML method over GS, we therefore solve Markov
chains defined indirectly via these modelling tools. We choose a
stochastic Petri-net model of a multiprocessor from the literature [1], a
small queueing network, a well-known queueing network model of a
multi-user computing system [15], and a stochastic Petri-net model of a
multi-tasking operating system that has recently appeared [3]. The
derivation of Markov chains from stochastic Petri-nets is described in
[1, 11] and from queueing networks in [6].

Figure 3 shows a multiprocessor system which consists of $n$ processors
Pr 1, Pr 2, ..., Pr n, each with a private memory unit PM 1, PM 2, ...,
PM n, to which they have direct access via a local bus. The processors
communicate via two common memory units CM 1 and CM 2. The processors
compete for access to the two common memory units via a global bus GB.
Ajmone Marsan, Balbo and Conte [1] give a GSPN model of this
multiprocessor (the structure of which is shown in figure 4) which allows
for the possibility of failure and repair of the processors, the bus and
the memory units. The model allows computation of the loss of effective
computational power of the processors due to downtime and competition for
the system resources.

Figure 5 shows the computational work of the GS and ML methods applied to
this problem, where the numerical values of the parameters are taken from
[1]. We show the total number of millions of floating point operations
needed to solve the problem as a function of problem size measured as the
number of processors in the model. The number of states of the Markov
chains varied from 91 (2 processors) to 3883 (10 processors). All problems
were solved to an accuracy of $\|Au\|_\infty < 10^{-10}$.

Comparing the GS and ML schemes, we see that, although these are
problems of very small size, the saving in computational effort of ML over
GS is quite dramatic: a factor of 27 for the smallest and of 109 for the
largest problems considered. It is also clear that the gap widens as the
problem size increases.

The SOR method, which is usually used as a solver in software tools for
stochastic Petri nets [2, 10], does not improve the situation for this
problem, because the optimum overrelaxation parameter is found to be one.

Figure 5: Computational work for GS (upper curve) and ML (lower curve)
to solve the Ajmone Marsan/Balbo/Conte problem.

Figure 6 shows a small open queueing system consisting of three
single-server queues, as might typically be used in a computer performance
model. Server S1 with service rate 90 represents a CPU which receives jobs
at a mean rate of 10 from the outside world for processing. There is a 70%
probability that jobs leaving S1 may then leave the system. Servers S2 and
S3 might represent I/O devices with service rates 7 and 5, respectively.
There is a 10% chance that a job leaving S3 returns to S1 for further
processing. The queues have a finite capacity $c$ and reject incoming jobs
when full. We assume that job arrival times and service rates are
exponentially distributed, allowing us to model this system by a Markov
chain.

Figure 6: Queueing system test case

For this problem, the Markov chain has a regular three-dimensional
structure which is somewhat analogous to that of a finite-element
discretization of a PDE. Each state of the chain may be characterized by
the vector $(n_1, n_2, n_3)$, where $n_i$ denotes the number of jobs in
queue $i$, which may also be interpreted as a coordinate index in the
$i$-th dimension of the three-dimensional "grid" of states. The
transitions contained in the chain are the following:

$$(n_1, n_2, n_3) \to (n_1 + 1, n_2, n_3)$$

$$(n_1, n_2, n_3) \to (n_1 - 1, n_2, n_3)$$

$$(n_1, n_2, n_3) \to (n_1 - 1, n_2 + 1, n_3)$$

$$(n_1, n_2, n_3) \to (n_1 - 1, n_2, n_3 + 1)$$

$$(n_1, n_2, n_3) \to (n_1 + 1, n_2, n_3 - 1)$$

$$(n_1, n_2, n_3) \to (n_1, n_2 - 1, n_3)$$

$$(n_1, n_2, n_3) \to (n_1, n_2, n_3 - 1)$$

where $0 \le n_i \le c$ and transitions to states with negative-valued
index are disallowed.
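
To show how such a chain can be assembled, the sketch below builds a dense matrix for a small capacity $c$. Several details are assumptions of this example and are not given in the text: the 30% of jobs that stay in the system after S1 are split equally between S2 and S3, jobs leaving S2 leave the system, and any transition that would overfill a queue is simply dropped.

```python
from itertools import product


def build_queueing_chain(c, arrival=10.0, mu=(90.0, 7.0, 5.0),
                         p_leave=0.7, p_to_s2=0.15, p_to_s3=0.15,
                         p_return=0.1):
    """Return (states, A) with A[dst][src] holding the transition rate.

    The routing split p_to_s2 / p_to_s3 and the treatment of full queues
    are assumptions of this sketch.
    """
    states = list(product(range(c + 1), repeat=3))
    index = {s: k for k, s in enumerate(states)}
    n = len(states)
    A = [[0.0] * n for _ in range(n)]

    def add(src, dst, rate):
        # Keep only transitions that stay inside the grid 0 <= n_i <= c.
        if all(0 <= x <= c for x in dst):
            A[index[dst]][index[src]] += rate
            A[index[src]][index[src]] -= rate

    for s in states:
        n1, n2, n3 = s
        add(s, (n1 + 1, n2, n3), arrival)                   # external arrival
        add(s, (n1 - 1, n2, n3), mu[0] * p_leave)           # S1 -> outside
        add(s, (n1 - 1, n2 + 1, n3), mu[0] * p_to_s2)       # S1 -> S2
        add(s, (n1 - 1, n2, n3 + 1), mu[0] * p_to_s3)       # S1 -> S3
        add(s, (n1 + 1, n2, n3 - 1), mu[2] * p_return)      # S3 -> S1
        add(s, (n1, n2 - 1, n3), mu[1])                     # S2 -> outside
        add(s, (n1, n2, n3 - 1), mu[2] * (1.0 - p_return))  # S3 -> outside
    return states, A


states, A = build_queueing_chain(2)
print(len(states))
```

A dense matrix is used only for brevity; for the problem sizes discussed in this section a sparse storage scheme would be needed.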

The computational work, measured in millions of floating point
operations, required by GS and ML to solve the queueing problem is shown
in figure 7. For this problem also, ML is more than ten times faster than
GS, although the "sidelength" of the largest Markov chain considered is
only 40.

Figure 8 shows the computational effort, measured as millions of floating
point operations carried out, for the solution of a computer
multiprogramming model which is due to Stewart [15] with the Gauss-Seidel
and multilevel methods. The model describes a computing system consisting
of a CPU and two I/O devices and the flow of jobs in this system that are
initiated by a number of users typing commands at terminals. We used the
parameter values as in [15]. This model is of nearly completely
decomposable type, i.e., it is described by a matrix that is close to
block diagonal. The problem is scaled by increasing the number of jobs in
the system. Already for 15 jobs, where the Markov chain has 810 unknowns,
the multilevel method is a factor of 1083 more efficient than
Gauss-Seidel. The improvement grows with the problem size.

Figure 7: Computational work for GS (upper curve), ML (lower curve) to
solve the queueing problem.

Figure 9 shows numerical results for a Markov chain derived from a
stochastic Petri-net model of an operating system due to Greiner et al
[3]. The model represents the state changes of processes in a computer
with a multitasking operating system. The problem can be scaled by
increasing the number of jobs that are in the system simultaneously. On
the left, the computational effort of GS and ML, measured in millions of
floating point operations, is shown as a function of problem size. The
performance improvement of ML is once again seen to increase with problem
size; at 8 jobs ML is 70 times faster than GS. On the right side of
Figure 9, the convergence factor is shown. The factor for ML is a constant
0.16 for all problems other than the smallest, whereas that of GS
deteriorates from 0.975 to 0.998 over the interval.
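
To put these convergence factors into perspective, the following sketch estimates how many iterations are needed to reduce an error by a factor of $10^{-10}$. It assumes, as a simplification, that the quoted asymptotic factor applies from the first iteration onwards; the function name and the reduction target are choices made for this example.

```python
import math


def iterations_needed(rho, reduction=1e-10):
    """Smallest k with rho**k <= reduction, for a factor 0 < rho < 1."""
    return math.ceil(math.log(reduction) / math.log(rho))


for rho in (0.16, 0.975, 0.998):
    print(rho, iterations_needed(rho))
```

The iteration counts are not directly comparable in cost, since one ML cycle involves smoothing and coarse-level work on every level, whereas one GS iteration is a single sweep.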


6 Conclusions 

We have discussed the recently introduced multilevel solution method for
the steady state analysis of Markov chains, an important class of problems
in the stochastic modelling of physical systems. The method is motivated
as a generalization of well-known iterative aggregation-disaggregation
techniques to include multiple levels.

Figure 8: Numerical Results for Interactive Multitasking Model. Upper
curve: Gauss-Seidel; Lower curve: Multilevel.

It is shown that the algorithm is equivalent to an FAS multigrid method
with a Galerkin-style coarse grid operator and a solution-dependent
prolongation operator. The latter property of the scheme ensures that the
coarse grid correction produces solution values that are bounded between
zero and one and that it leaves the relative probabilities within
aggregates unchanged. This may prove useful for the solution of PDEs with
similar constraints on the solution, including, for example, mass fraction
problems, where traditional multigrid coarse grid corrections may produce
under- and over-shoots.

The algorithm is shown to perform well compared to currently used algo- 
rithms, obtaining performance improvements of up to three orders of magni- 
tude on the problems selected. 

Further work will include a more "multigrid-like" coarsening strategy and
a prolongation operator which is more similar to an interpolation. In this
manner, it is hoped that the poor performance of the multilevel method for
homogeneously structured problems with very smooth solutions, which has
been observed, can be improved.